Abstract

I  discuss  the  role  of  economic  theory  in  empirical  work  in  development  economics  with 
special  emphasis  on  general  equilibrium  and  political  economy  considerations.  I  argue  that 
economic theory plays (and should play) a central role in formulating models, estimates of which
can be used for counterfactual and policy analysis. I discuss why counterfactual analysis
based  on  microdata  that  ignores  general  equilibrium  and  political  economy  issues  may  lead  to 
misleading  conclusions.  I  illustrate  the  main  arguments  using  examples  from  recent  work  in 
development  economics  and  political  economy. 

Development economics investigates the causes of poverty and low incomes around the
world and seeks to make progress in designing policies that could help individuals, regions,
and  countries  to  achieve  greater  economic  prosperity.  Economic  theory  plays  a  crucial  role  in 
this  endeavor,  not  only  because  it  helps  us  focus  on  the  most  important  economic  mechanisms, 
but  also  because  it  provides  guidance  on  the  external  validity  of  econometric  estimates,  meaning 
that  it  clarifies  how  we  can  learn  from  specific  empirical  exercises  about  the  effects  of  similar 
shocks  and  policies  in  different  circumstances  and  when  implemented  on  different  scales. 

General  equilibrium  and  political  economy  issues  often  create  challenges  for  this  type  of 
external validity. General equilibrium refers to factors that become important when we conduct
(or ask questions involving) counterfactuals in which large changes are contemplated.
The  difficulty  lies  in  the  fact  that  such  counterfactuals  will  induce  changes  in  factor  prices  and 
technology  which  we  hold  fixed  in  partial  equilibrium  analysis  and  create  different  composition 
effects  than  in  partial  equilibrium.  Political  economy  refers  to  the  fact  that  the  feasible  set 
of  interventions  is  often  determined  by  political  factors  and  large  counterfactuals  will  induce 
political responses from various actors and interest groups. General equilibrium and political
economy considerations are important because partial equilibrium estimates that ignore
responses  from  both  sources  will  not  give  the  appropriate  answer  to  counterfactual  exercises. 

In this essay, I first explain why it is important to think of external validity in policy analysis,
particularly in development economics, and the role of economic theory in this exercise.
I then illustrate the importance of general equilibrium reasoning in several major problems
in development economics. Finally, I argue that political economy considerations have to be
central  to  any  investigation  of  development  problems  and  that  inferences  that  ignore  political 
economy  can  go  wrong.  The  online  appendix  includes  a  discussion  of  some  additional  issues 
that  arise  in  the  context  of  using  theory  in  guiding  estimation. 

Why  Development  Economics  Needs  Theory 

There is no general agreement on whether we should use economic theory for formulating and
then subsequently attempting to estimate "structural parameters". I argue that the answer is
largely "yes" because otherwise econometric estimates would lack external validity, in which
case they could neither inform us about whether a particular model or theory is a useful
approximation to reality, nor provide guidance on what the effects of similar shocks and
policies would be in different circumstances or if implemented on different scales. I therefore
define structural parameters as those that provide external validity and would thus be useful
in testing theories or in policy analysis beyond the specific environment and sample from
which they are derived.^1 External validity becomes a particularly challenging task in the
presence of general equilibrium and political economy considerations, and a major role of
economic theory is in helping us overcome these problems or at the very least alerting us to
their importance.

To  illustrate  these  points,  consider  the  relationship  between  the  cost  of  schooling  and 
schooling  decisions.  We  can  describe  this  relationship  purely  as  a  descriptive  one,  focusing 
on  a  sample  and  looking  at  the  correlation  or  the  ordinary  least  squares  relationship  between 
these  two  variables.  For  example,  we  could  specify  the  following  "reduced-form"  relationship: 

$$\log(s_i) = X_i'\beta - \alpha \log(c_i) + \varepsilon_i,$$

where $i$ denotes an individual in the sample, $s_i$ is years of schooling, $c_i$ denotes the cost of
schooling to the individual resulting, for example, from forgone earnings and actual costs of
attending schools, $X_i$ is a vector of characteristics of this individual for which we may wish
to control, and $\beta$ is a vector of parameters. The parameter of interest is $\alpha$. We can then use
ordinary least squares to estimate $\beta$ and $\alpha$.

Alternatively, we could start with an economic model. In fact, some simple theories will
lead to exactly this equation. Suppose, for example, that the human capital of an individual is
a function of her level of schooling. In particular, suppose that the human capital of individual
$i$ is given by $h_i = s_i^{1-\sigma}$, for some parameter $\sigma \in (0,1)$, where $s_i$ denotes her level of schooling. She
can then earn income equal to $y_i = w h_i$, where $w$ is the market wage per unit of human capital.
In addition, individual $i$ has a cost of schooling given by $\zeta_i c_i s_i$, where $\zeta_i$ is an unobserved
non-monetary cost component and $c_i$ is the monetary cost of schooling for this individual.
Suppose that individuals maximize net income, so that individual $i$ will choose schooling to
maximize income net of the cost of schooling, that is, $w s_i^{1-\sigma} - c_i \zeta_i s_i$. After working through the
maximization problem, this model implies a relationship identical to the reduced-form equation
we started with, but now the parameter $\alpha$ corresponds to $1/\sigma$.^2 Once this equation is derived,
estimation is also straightforward and can again be performed by ordinary least squares.
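
As a rough numerical check of this equivalence, the sketch below simulates schooling choices from the model and then runs the reduced-form regression on the simulated data. The lognormal distributions for $c_i$ and $\zeta_i$, the wage level, the value of $\sigma$, and the helper names are all assumptions made for illustration; the unobserved cost is drawn independently of $c_i$, which is what allows ordinary least squares to recover $\alpha = 1/\sigma$ here.

```python
import math
import random


def ols_slope(x, y):
    """Slope coefficient from a bivariate OLS regression of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_schooling(n, sigma, wage, seed=0):
    """Return (log c_i, log s_i) lists implied by the model's optimal schooling choice."""
    rng = random.Random(seed)
    K = ((1.0 - sigma) * wage) ** (1.0 / sigma)
    log_c, log_s = [], []
    for _ in range(n):
        c = math.exp(rng.gauss(0.0, 0.5))     # monetary cost c_i (assumed lognormal)
        zeta = math.exp(rng.gauss(0.0, 0.2))  # unobserved cost, drawn independently of c_i
        s = K * (zeta * c) ** (-1.0 / sigma)  # s_i = K (zeta_i c_i)^(-1/sigma)
        log_c.append(math.log(c))
        log_s.append(math.log(s))
    return log_c, log_s


sigma = 0.5  # hypothetical value, so that 1/sigma = 2
log_c, log_s = simulate_schooling(n=5000, sigma=sigma, wage=10.0)
alpha_hat = -ols_slope(log_c, log_s)  # the estimating equation has -alpha on log(c_i)
print(f"alpha_hat = {alpha_hat:.3f}, 1/sigma = {1.0 / sigma:.3f}")
```

The estimate should be close to $1/\sigma$ in a sample of this size. If the unobserved cost were correlated with $c_i$, the same regression would no longer recover $1/\sigma$.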


^1 See Shadish, Cook and Campbell (2002) on internal and external validity. The notion of external validity,
in particular the emphasis on counterfactual exercises, as the defining characteristic of a structural parameter is
closely related to Marschak's (1953) definition, which distinguishes between structural parameters that provide
"useful knowledge" for understanding the effects of policy within a given sample and/or in new environments.
It also clearly presupposes that the empirical strategy has been successful in estimating "causal" effects (for
example, as defined in Angrist, Imbens and Rubin, 1996).

^2 Specifically, the optimal choice of individual $i$ is $s_i = K (\zeta_i c_i)^{-1/\sigma}$, where in this case $K = ((1-\sigma) w)^{1/\sigma}$.
After taking logs and defining $\varepsilon_i \equiv -\log \zeta_i / \sigma$ and $\alpha \equiv 1/\sigma$, this gives the reduced-form equation above.


Next  comes  the  harder  part.  We  have  seen  that  the  same  equation  can  be  posited  as  a 
reduced-form  relationship,  or  it  can  be  derived  from  an  economic  model.  But  at  the  end  it  is 
the  same  equation,  and  it  can  be  estimated  in  the  same  manner.  So  in  what  sense  can  we  think 
of  it  as  a  "structural  relationship"?  The  answer  is  related  to  the  notion  of  external  validity 
introduced  above.  Suppose  we  now  ask  the  question:  what  would  be  the  effects  of  subsidies 
to reduce the cost of schooling, $c_i$, for a set of individuals? This counterfactual experiment
could be motivated by a potential policy that is being contemplated or it may be used for
understanding and testing the implications of our theory. The question might be for the same
sample on which the initial estimation was performed or it could be for an entirely different
sample or population. In either case, one answer to the above question readily follows from
using the estimates of $\alpha$ to compute the increase in the years of schooling for individuals whose
cost  of  schooling  has  declined.  But  can  we  trust  this  answer? 

If $\alpha$ is indeed a structural parameter, then we should trust this answer (obviously, subject
to standard errors), but not otherwise.^3 To illustrate what might go wrong when $\alpha$ does
not correspond to a structural parameter, imagine, for example, that years of schooling are
constrained by school enrollments, which are in turn constrained by the sizes of schools. In this
setting, let us further assume that individuals with low cost of schooling get proportionately
more of the available school resources (for instance, due to some type of efficient rationing).^4
In this example, we can still estimate the relationship between $s$ and $c$, and we will obtain a
meaningful-looking estimate of $\alpha$. However, the estimate will lack external validity. Consider a
policy of expanding the subsidy for schooling to individuals that does not change the constraint
that total years of schooling are determined by the sizes of schools. Then the estimate of $\alpha$
from the pre-subsidy regime will not necessarily inform us about the post-subsidy relationship
between cost of schooling and years of schooling and will not give us accurate predictions about
the impact of the policy.
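
The failure of external validity in this example can be made concrete with a small simulation. The sketch below assumes that a fixed total number of schooling years is rationed in proportion to $(\zeta_i c_i)^{-1/\sigma}$, estimates $\alpha$ on the pre-subsidy cross section, and then compares the naive prediction for a uniform 20 percent cost subsidy with what the rationed system actually delivers. All parameter values, distributions, and function names are hypothetical.

```python
import math
import random


def ols_slope(x, y):
    """Slope coefficient from a bivariate OLS regression of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def rationed_schooling(costs, zetas, sigma, total_seats):
    """Allocate a fixed total of schooling years in proportion to (zeta_i c_i)^(-1/sigma)."""
    weights = [(z * c) ** (-1.0 / sigma) for c, z in zip(costs, zetas)]
    K = total_seats / sum(weights)  # K adjusts so that schooling sums to total_seats
    return [K * w for w in weights]


rng = random.Random(1)
sigma, total_seats, n = 0.5, 8000.0, 1000  # hypothetical values
costs = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(n)]
zetas = [math.exp(rng.gauss(0.0, 0.2)) for _ in range(n)]

# Pre-subsidy cross section: the regression still yields a sensible-looking alpha.
before = rationed_schooling(costs, zetas, sigma, total_seats)
alpha_hat = -ols_slope([math.log(c) for c in costs], [math.log(s) for s in before])

# Naive counterfactual: scale each person's schooling by subsidy^(-alpha_hat).
subsidy = 0.8  # every monetary cost falls by 20 percent
predicted_total = sum(s * subsidy ** (-alpha_hat) for s in before)

# Actual outcome: the capacity constraint still binds, so K falls and the total is unchanged.
after = rationed_schooling([subsidy * c for c in costs], zetas, sigma, total_seats)
actual_total = sum(after)

print(f"alpha_hat = {alpha_hat:.2f}")
print(f"predicted total = {predicted_total:.0f}, actual total = {actual_total:.0f}")
```

Under these assumptions the cross-sectional estimate is close to $1/\sigma$, yet the predicted increase in total schooling never materializes because the subsidy changes $K$ rather than the capacity of schools.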


^3 Many empirical equations that do not correspond to structural relationships may nonetheless contain useful
information;  they  just  cannot  be  used  for  counterfactual  policy  analysis.  We  might  simply  be  interested  in 
uncovering  correlations,  which  may  help  us  distinguish  between  theories,  since  many  relevant  theories  will  have 
implications  about  what  these  correlations  should  look  like.  This  suggests  that  it  is  often  useful  to  estimate 
reduced-form  relationships  that  do  not  have  structural  interpretations,  but  when  doing  so,  we  should  be  explicit 
about  how  they  should  be  (and  not  be)  interpreted. 

^4 More specifically, the constraint on school enrollments might imply that total years of schooling should be
equal to $S$, that is, $\sum_i s_i = S$. Suppose that the economic relationship $s_i = K (\zeta_i c_i)^{-1/\sigma}$ still holds at the
individual level (i.e., individuals with low cost of schooling get proportionately more of the available school
resources). But it must do so with a different value of $K$ than in footnote 2. In particular, the constraint on
total schooling implies $K = S \left[ \sum_i (\zeta_i c_i)^{-1/\sigma} \right]^{-1}$. When the cost of schooling is subsidized, the underlying
economic relationship with the new definition of $K$ given here remains unchanged, but the reduced-form
relationship captured by our estimating equation above changes (exactly as shown by the above formula for $K$).


The  problem  described  here  is  of  course  a  version  of  the  Lucas  critique  (Lucas,  1976)  that 
reduced-form relationships will not be stable in the face of policy interventions. However,
the  discussion  also  highlights  that  this  problem  is  not  simply  circumvented  by  deriving  the 
relationship  of  interest  from  an  economic  model,  unless  this  model  incorporates  the  relevant 
constraints  and  margins  of  choice.  In  the  above  example,  no  model  that  fails  to  incorporate  the 
constraint  on  total  enrollments  will  be  informative  about  counterfactuals  involving  large-scale 
interventions.  Thus,  our  confidence  in  the  implied  answers  to  policy  experiments  crucially 
depends  on  our  confidence  in  having  captured  the  appropriate  structural  relationship  with  the 
model  we  are  estimating. 

How  do  we  convince  others  and  ourselves  that  our  estimates  have  external  validity  and  can 
be used for policy analysis or for testing theories? This is where economic theory becomes
particularly useful. As a first step, we have to establish, using economic theory, common sense
and evidence, that key factors potentially affecting the response to the relevant counterfactual
are accounted for, and that the model and the functional form we chose indeed capture the salient
aspects of reality ("are a good approximation to reality"). This in turn involves arguing
that the functional form is stable over time and across relevant samples, that variation across
individuals not captured by the covariates and the cost of schooling can be incorporated into
the error term $\varepsilon_i$, and that this error term can be modeled as additive and orthogonal to
(uncorrelated  with)  the  other  variables  included  in  the  equation.  Using  economic  theory  is 
often  the  best  way  of  clarifying  whether  key  factors  have  been  omitted,  and  whether  the 
underlying  assumptions  can  be  defended  and  provide  a  good  approximation  to  reality. 

However, the previous discussion also highlights that specifying a model that justifies a
specific estimating equation is typically not difficult, and may not solve the underlying problem.
For example, we saw how we could derive exactly the same estimating equation from a model
of individual schooling choice; but if in reality years of schooling are constrained by the sizes
of schools, the estimates of $\alpha$ will still not be useful for understanding the implications of a
large-scale subsidy for schooling. The problem of course is that for studying the implications
of this type of policy, the constraints resulting from the sizes of schools are central, and any
model that does not recognize these constraints will not be helpful in such a study. This
emphasizes that the proper use of economic theory does not mean merely writing down a specific
model; instead, it requires that we incorporate the appropriate constraints and margins of
adjustment, that we develop the case that economic theory robustly leads to the estimating
equation in question, and that we clarify which important economic mechanisms and effects
are being excluded from the model.^5

Another  advantage  of  the  structural  reasoning  based  on  theory  is  that  once  we  go  through 
the  process  of  explicitly  justifying  the  equation  we  are  estimating,  either  using  economic  theory 
or  other  theoretical  or  empirical  arguments,  we  may  realize  that  such  an  equation  cannot  easily 
be  defended.  In  such  cases,  it  has  to  be  interpreted  with  greater  caution,  or  perhaps  it  has  to 
be  modified  or  abandoned.  This  advantage  becomes  particularly  important  in  contexts  where 
general  equilibrium  and  political  economy  effects  are  present.  Finally,  economic  theory  provides 
the  best  way  of  interpreting  what  the  estimates  from  an  equation,  such  as  the  one  we  started 
with,  mean.  For  example,  when  this  equation  is  derived  from  the  economic  model  above, 
we understand that $\alpha = 1/\sigma$ is a function of the elasticity of the human capital production
function.

The  structural  approach  also  faces  major  challenges,  however.  First,  as  already  emphasized, 
writing  down  a  model  like  the  one  described  above  is  clearly  not  sufficient  for  achieving  external 
validity.  That  model  itself  made  several  assumptions  which  are  restrictive  and  may  not  provide 
a  good  approximation  to  the  economic  phenomena  in  which  we  are  interested.  This  is  again 
illustrated  by  the  above  example,  which  showed  that  one  might  end  up  deriving  the  same 
estimating  equation  from  a  theoretical  model  and  thus  reach  the  same  conclusions  about  the 
implications of a counterfactual policy change as one might have done by just specifying a
reduced-form  equation. 

Second, we may in fact question whether there is any ground for assuming a constant
elasticity $\alpha$ between years of schooling and costs of schooling. After all, we know that all theories
are abstractions and approximations, so there is little reason to believe that a parameter such
as $\alpha$ (or the intertemporal elasticity of substitution, or the Frisch elasticity of labor supply, or
the elasticity of substitution between two factors, or any other Marschakian preference or
technology parameter) should really be constant. But without such constancy, there are severe
limits to external validity.

Finally, one may even question the existence or usefulness of "structural parameters"
altogether. What we take as a structural parameter for one theory will naturally become an
endogenous object in another. So a particular model can serve us well as an abstraction for
a series of counterfactual experiments, but there will exist other experiments for which it will
be much less informative. For example, an elasticity of substitution or certain technology
parameters may be constant with respect to certain variations, but would change in response
to others. This is almost by necessity: a precondition for external validity is that key factors
relevant for the outcome of the counterfactual should be included in the model, and models as
abstractions have to exclude several relevant factors, so no single model can include all of the
relevant factors for all possible counterfactual exercises.

^5 The online appendix discusses some issues that arise in thinking of how we could develop such robust
predictions and how we could try to map them to data. This discussion also highlights that in certain cases one
could achieve counterfactual validity without much theory. For example, we need only the most basic theory in
interpreting a controlled experiment designed to evaluate the effectiveness of a drug. In this case, we can say
that common sense and a very limited amount of medical theory are sufficient to interpret the results of the
controlled experiment and decide whether they are informative about the effectiveness of the drug in question
beyond the experimental setting. It should also be noted that the evaluation of the effectiveness of a drug in this
example has a clear parallel to "modeling individual behavior" in economics. As further discussed below, the
role of economic theory becomes even more central when our focus shifts to "modeling equilibrium behavior".

These challenges notwithstanding, it is clear that we often have to take a position on
whether the parameters being estimated correspond to structural parameters (at least for a
well-defined though perhaps limited set of variations in environment and policy). Otherwise, we
will have no way of performing counterfactual exercises and making predictions about policy
changes (see Imbens, 2009). But this necessitates a claim to external validity (even if it is
only implicit), and economic theory is our best guide for formulating the appropriate models
and justifying such claims to external validity. These issues become only more central in the
presence of general equilibrium effects and political economy factors, which I turn to next.

The  Centrality  of  General  Equilibrium 

The  bulk  of  empirical  work  using  microdata,  particularly  in  development  economics,  engages  in 
partial equilibrium comparisons. Depending on magnitudes of various effects, general equilibrium
interactions can offset or even reverse sensible partial equilibrium conclusions. However,
most empirical strategies do not directly estimate general equilibrium effects.^6

Economic theory nonetheless provides some guidance in assessing the importance of general
equilibrium effects. Three types of general equilibrium effects, which are usually not estimated
in partial equilibrium comparisons, are potentially important. First, in response to large policy
interventions or shocks, imperfect substitution between factors and diminishing returns imply
that factor productivities and prices will change. Second, the same policy interventions or
shocks can lead to endogenous technology responses. Third, there may be composition effects
resulting from equilibrium substitution of some factors or products for others (whereby the
composition of micro units changes differently in response to different types of interventions).
Theory generally implies that the first and the third effects will tend to partially offset or even
reverse direct partial equilibrium effects, while endogenous technology responses could either
dampen or magnify them (see Acemoglu, 2007, for general theoretical results on endogenous
technology).

^6 See also Townsend (2009) for a complementary discussion of the role of general equilibrium analysis in
development economics, with special emphasis on credit market issues; Heckman, Lochner and Taber (1998)
for a discussion of general equilibrium issues in the analysis of the effects of technology on wage inequality; and
Duflo (2004a) for a discussion of other difficulties in "scaling up" policy interventions evaluated using microdata.

As  an  example  of  factor  price  changes,  consider  the  problem  of  estimating  the  returns  to 
schooling.  This  is  typically  done  by  focusing  on  a  small  group  of  individuals  who  are  induced  to 
remain  in  school  for  longer  and  comparing  them  to  other  individuals  in  the  same  market  (thus 
facing  the  same  prices)  who  have  dropped  out  of  school.  The  implicit  assumption  here  is  that 
altering  schooling  decisions  will  not  generate  changes  in  market  prices.  But  for  many  of  the 
questions relevant for development economics, we wish to think of counterfactuals in which a
large fraction of the population acquires more schooling. In this case, it is no longer plausible to
assume that prices will necessarily remain constant. Imperfect substitution between different
skill levels will typically imply that an increase in the schooling level of a significant fraction
of the population may reduce the return to schooling. For example, Angrist (1995) shows that
the large school building programs in the Palestinian territories led to a sharp drop in the skill
premium.
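
One way to see the size of this factor price effect is to assume a specific technology. The sketch below assumes a constant-elasticity-of-substitution aggregate of skilled labor $H$ and unskilled labor $L$ with fixed factor-augmenting productivities, in which case the relative wage of skilled workers is $(A_H/A_L)^{(e-1)/e} (H/L)^{-1/e}$, where $e$ denotes the elasticity of substitution. The functional form, the elasticity value of 1.5, and the function name are assumptions for illustration, not estimates taken from the studies cited here.

```python
def skill_premium(h_over_l, elasticity, a_h_over_a_l=1.0):
    """Skilled-to-unskilled wage ratio under an assumed CES aggregate of H and L.

    h_over_l     : relative supply of skilled workers, H / L
    elasticity   : elasticity of substitution between the two skill groups
    a_h_over_a_l : relative factor-augmenting productivity, held fixed here
    """
    technology_term = a_h_over_a_l ** ((elasticity - 1.0) / elasticity)
    supply_term = h_over_l ** (-1.0 / elasticity)
    return technology_term * supply_term


# Doubling the relative supply of skilled workers with technology held fixed.
before = skill_premium(h_over_l=0.1, elasticity=1.5)
after = skill_premium(h_over_l=0.2, elasticity=1.5)
print(f"premium falls by a factor of {after / before:.2f}")
```

A partial equilibrium estimate of the return to schooling implicitly holds `h_over_l` fixed. With imperfect substitution (a finite elasticity), a large expansion of schooling moves this ratio and lowers the premium unless technology also responds.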

As  an  example  of  endogenous  technology  responses,  consider  the  large  increase  in  the 
relative  supply  of  college-educated  workers  in  the  United  States  starting  in  the  late  1960s. 
Given  technology,  this  change  in  relative  supply  should  have  reduced  the  college  premium.  As 
is  well  known,  the  opposite  happened  in  practice,  and  the  college  premium  increased  sharply 
from  the  late  1970s  onwards.  Acemoglu  (1998)  argues,  for  example,  that  this  was  a  consequence 
of  the  endogenous  response  of  technology  to  the  relative  abundance  of  more  skilled  workers.  The 
same  reasoning  implies  that  in  evaluating  the  effect  of  trade  opening,  one  could  not  simply  rely 
on  partial  equilibrium  estimates  derived  from  firm-level  variation  in  access  to  foreign  markets, 
since  trade  opening  is  a  general  equilibrium  change  that  will  also  affect  technology  choices  and 
the  direction  of  technological  change. 

As  an  example  of  composition  effects,  consider  the  problem  of  estimating  the  importance  of 
credit market imperfections. Banerjee and Duflo (2005) survey a large body of evidence that
small and medium-sized businesses in less-developed economies are credit constrained and that an
extension of credit to these businesses will make them increase production. Now consider
the effect of a large-scale policy of credit expansion to small- and medium-sized businesses.
This policy could lead to a different type of composition effect than the one operating in
partial equilibrium. For example, it may be the case that in partial equilibrium estimation
focusing on firm-level variation we found that firms with better access to credit expanded, but
this was at the expense of other firms that did not have access to credit (that is, partly by
"stealing business" from others). And yet, the same response cannot take place in general
equilibrium. As a consequence, when additional credit becomes available to a large fraction of
firms, total output may not increase by as much or at all. One could thus imagine a situation
in which partial equilibrium estimates of relaxing credit constraints are large, while the general
equilibrium effects would be small.
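
The business-stealing logic can be illustrated with a deliberately extreme toy economy in which total market demand is fixed and firms split it in proportion to their capacity. This fixed-demand assumption, the number of firms, and the size of the credit-induced capacity boost are all hypothetical; the point is only to show how a large firm-level estimate can coexist with a zero aggregate effect.

```python
def sales(capacities, market_size):
    """Split a fixed market across firms in proportion to their capacity."""
    total_capacity = sum(capacities)
    return [market_size * k / total_capacity for k in capacities]


n_firms, market_size, boost = 100, 1000.0, 1.5  # credit raises capacity by 50 percent
baseline = sales([1.0] * n_firms, market_size)

# Partial equilibrium experiment: only 5 of the 100 firms receive credit.
treated = 5
pe = sales([boost] * treated + [1.0] * (n_firms - treated), market_size)
pe_gap = pe[0] / pe[-1] - 1.0  # treated firm relative to an untreated firm

# Scaled-up policy: every firm receives credit.
ge = sales([boost] * n_firms, market_size)
ge_change = ge[0] / baseline[0] - 1.0  # change in a firm's sales relative to baseline

print(f"firm-level gap in the small experiment: {pe_gap:.2f}")
print(f"change in each firm's sales when all are treated: {ge_change:.2f}")
print(f"total output: {sum(pe):.0f} (small experiment), {sum(ge):.0f} (scaled up)")
```

In this toy economy the treated firms grow entirely at the expense of the untreated ones, so the firm-level comparison overstates what a scaled-up policy would achieve. With a more elastic market demand the aggregate effect would be positive but still smaller than the firm-level gap.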

I  now  further  elaborate  the  first  general  equilibrium  effect,  working  through  endogenous 
factor  prices  and  diminishing  returns,  in  the  context  of  the  effect  of  life  expectancy  (and  health) 
on  economic  growth.  A  large  microeconometric  literature  shows  that  healthier  individuals  are 
more productive: see, among others, Behrman and Rosenzweig (2004), Schultz (2002), and
Strauss and Thomas (1998). On this basis, we would expect an increase in the life expectancy
of the workforce to lead to greater aggregate productivity. But one should take general
equilibrium effects into account, since an increase in life expectancy also increases population, and
because of diminishing returns to capital and land, it may decrease labor productivity and
may in fact reduce income per capita. How could one investigate whether these general
equilibrium effects are important? One approach is to use information from other sources in order
to "calibrate" the values of the parameters and then combine this with micro estimates of the
effect of health and life expectancy on individual outcomes.^7 This approach will be successful
when  we  can  have  confidence  in  the  calibration  exercise. 
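
A back-of-the-envelope version of such a calibration is sketched below. It assumes a Cobb-Douglas technology $Y = A (hN)^{a} F^{1-a}$, where $h$ is productivity per worker, $N$ is population, and $F$ is a fixed composite of capital and land, so that the change in log income per capita is $a \,\Delta \log h - (1-a)\, \Delta \log N$. The functional form and all numbers are illustrative assumptions, not the calibrations used in the studies cited here.

```python
import math


def log_income_per_capita_change(labor_share, human_capital_growth, population_growth):
    """Change in log(Y/N) when Y = A (hN)^a F^(1-a) and the composite F is held fixed.

    labor_share          : exponent a on effective labor
    human_capital_growth : proportional increase in productivity per worker, h
    population_growth    : proportional increase in population, N
    """
    productivity_gain = labor_share * math.log(1.0 + human_capital_growth)
    dilution_loss = (1.0 - labor_share) * math.log(1.0 + population_growth)
    return productivity_gain - dilution_loss


# Micro channel only: healthier workers are 5 percent more productive, population unchanged.
micro_only = log_income_per_capita_change(0.6, 0.05, 0.0)

# Same productivity gain, but population also rises by 20 percent.
with_population = log_income_per_capita_change(0.6, 0.05, 0.20)

print(f"micro channel only: {micro_only:+.3f}")
print(f"with population response: {with_population:+.3f}")
```

With these hypothetical numbers the sign of the effect on income per capita flips once the population response is included, which is why the calibrated parameters matter so much for this approach.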

A  second  approach  is  to  use  cross-country  variation,  even  though  such  variation  will  be 
affected  by  several  potentially  omitted  factors.  Acemoglu  and  Johnson  (2007)  adopt  this 
approach. They derive the following linear relationship between log life expectancy, $x_{it}$, and
log income per capita, $y_{it}$, from a neoclassical growth model and the possibility that life
expectancy might have a direct positive effect on technology and on human capital:

$$y_{it} = \pi x_{it} + \zeta_i + \mu_t + \varepsilon_{it},$$

where $\zeta_i$ and $\mu_t$ denote country and time effects.

The parameter of interest, $\pi$, measures the relationship between log income per capita and log
life expectancy. Though this equation can be estimated by ordinary least squares, this is likely
to lead to biased estimates of $\pi$, since societies that are successful in solving economic and
institutional problems to achieve higher growth are also likely to provide better public health
and other measures that improve life expectancy, and since an increase in income per capita is
likely to lead to a mechanical improvement in life expectancy.

^7 This is the approach advocated by Banerjee and Duflo (2005) and used by Weil (2007) in the context of
health and economic development and by Heckman, Lochner and Taber (1998) in the context of the relationship
between technology and wage inequality. Another approach not mentioned in the text, perhaps most promising,
is to combine microdata with regional variation to estimate partial and general equilibrium effects simultaneously.
This approach is adopted and developed in Acemoglu and Angrist (2000) to estimate human capital externalities
exploiting individual-level differences in schooling together with state-wide differences in average schooling (see
also Duflo, 2004b, for an application to Indonesian data) and in Acemoglu, Autor and Lyle (2004) to estimate
the general equilibrium effects of increased female labor supply (on male and female wages).

To overcome this problem, Acemoglu and Johnson (2007) adopt an instrumental-variables
strategy, exploiting global discoveries and diffusion of major drugs, chemicals, and public health
technologies. The idea is that these improvements should have raised life expectancy
differentially in countries that were subject to different types of initial disease burdens. To implement
this idea, they construct a "predicted mortality" variable, $M_{it}$, based on the 15 most infectious
diseases in 1940. They compute the pre-intervention (1940) mortality from each of these 15
diseases in each country. Then, when a global health intervention (technological breakthrough)
takes place for a given disease, predicted mortality in each country falls to a different level
depending on its pre-intervention mortality from that disease. More specifically, the predicted
mortality variable uses a country's initial mortality rate for each of the 15 diseases until there
is a global intervention, and after the global intervention, the mortality rate for the disease in
question declines to the frontier mortality rate.^8 Predicted mortality, $M_{it}$, is then used as an
instrument for log life expectancy in the estimation of the relationship between log income per
capita and log life expectancy. With this reasoning, the first-stage relationship is

$$x_{it} = \psi M_{it} + \tilde{\zeta}_i + \tilde{\mu}_t + u_{it}.$$
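
The construction of the predicted mortality variable described above can be sketched in a few lines. The example uses two made-up diseases (`d1`, `d2`), two made-up countries, invented mortality rates, and invented intervention dates; only the rule itself (initial mortality before the global intervention, frontier mortality afterwards, summed over diseases) is taken from the text.

```python
def predicted_mortality(initial, frontier, intervention_year, year):
    """Predicted mortality for one country in one year.

    initial           : {disease: pre-intervention (1940) mortality in this country}
    frontier          : {disease: {year: mortality at the world health frontier}}
    intervention_year : {disease: date of the global intervention}
    """
    total = 0.0
    for disease, initial_rate in initial.items():
        after_intervention = year > intervention_year[disease]  # dummy equals 1 only after the intervention
        total += frontier[disease][year] if after_intervention else initial_rate
    return total


# Hypothetical data, for illustration only (deaths per 100,000).
frontier = {
    "d1": {1940: 40.0, 1950: 20.0, 1960: 10.0},
    "d2": {1940: 5.0, 1950: 2.0, 1960: 0.0},
}
intervention_year = {"d1": 1945, "d2": 1955}
initial_1940 = {
    "A": {"d1": 200.0, "d2": 50.0},  # high initial disease burden
    "B": {"d1": 60.0, "d2": 10.0},   # low initial disease burden
}

panel = {
    country: {year: predicted_mortality(rates, frontier, intervention_year, year)
              for year in (1940, 1950, 1960)}
    for country, rates in initial_1940.items()
}
for country, series in panel.items():
    print(country, series)
```

In this toy panel the country with the higher initial burden experiences the larger fall in predicted mortality, and that differential fall is the variation the first stage relies on. Note that the variable never uses a country's own post-1940 mortality, only its 1940 rates and the frontier rates.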

For this instrumental-variables approach to be valid, the key exclusion restriction is that
the covariance between the predicted mortality variable, $M_{it}$, and the
error term in the earlier income per capita equation, $\varepsilon_{it}$, must be zero (i.e., $\mathrm{Cov}(M_{it}, \varepsilon_{it}) = 0$).
Note that both the second and the first stages (the exclusion restriction) are motivated by
theory. The second stage is derived from the neoclassical growth model. The first stage
(and thus the exclusion restriction that $\mathrm{Cov}(M_{it}, \varepsilon_{it}) = 0$) is predicated on the theory that
global intervention for a particular disease will affect mortality in a country in proportion
with the number of initial deaths from the disease in question in that country, and more
importantly, that baseline levels of mortality from different diseases do not have a direct


Mathematically, predicted mortality is defined as

$$M_{it} = \sum_{d \in \mathcal{D}} \left[ (1 - I_{dt}) M_{dit_0} + I_{dt} M_{dFt} \right],$$

where $M_{dit}$ denotes mortality in country $i$ from disease $d$ at time $t$, $I_{dt}$ is a dummy for intervention for disease
$d$ at time $t$ (it is equal to 1 for all dates after the intervention), $\mathcal{D}$ denotes the set of the 15 infectious diseases,
$M_{dit_0}$ refers to the pre-intervention (1940) mortality from disease $d$ in the same units, and $M_{dFt}$ is the mortality
rate from disease $d$ at the health frontier of the world at time $t$.


effect  on  future  income  beyond  their  effect  working  through  future  life  expectancy  and  health 
conditions.  Acemoglu  and  Johnson  (2007)  provide  evidence  consistent  with  this  exclusion 
restriction.  For  example,  prior  to  1940  predicted  mortality  does  not  predict  future  income 
or  population  growth,  which  is  consistent  with  the  notion  that  past  levels  of  life  expectancy 
do  not  have  a  direct  effect  on  future  growth.  The  online  appendix  discusses  why  the  specific 
instrumental-variables  strategy  suggested  here  is  only  valid  with  certain  formulations  of  the 
second-stage  equation,  and  different  theories  for  the  relationship  between  health  and  growth, 
encapsulated  in  different  second-stage  relationships,  may  not  be  consistent  with  the  same 
exclusion  restriction,  thus  further  emphasizing  the  role  of  theory  in  guiding  the  estimation 
strategy. 

The surprising finding in Acemoglu and Johnson (2007) is that despite the well-established
positive micro estimates of the effect of health on productivity, in general equilibrium
the effect on income per capita appears to be negative. This result probably arises
because the improvements in life expectancy were associated with very large increases in population.
While this conclusion comes with several caveats, not least because the negative
estimates are often quite large and come from a specific episode (during which mortality rates
may have declined unusually rapidly relative to morbidity rates), it illustrates the possibility
that general equilibrium empirical conclusions can be quite different from partial equilibrium
ones. It reiterates the importance of incorporating general equilibrium considerations for
conducting counterfactual exercises concerning the effects of large changes in variables such
as schooling, health conditions or access to credit on income per capita or other aspects of
economic development.

No  Development  without  Political  Economy 

There  is  increasing  recognition  that  institutional  and  political  economy  factors  are  central  to 
economic  development.  Many  problems  of  development  result  from  barriers  to  the  adoption  of 
new technologies, lack of property rights over land, labor and businesses, and policies distorting
prices and incentives. These institutions and policies are not in place exclusively, or even
primarily, because of a lack of understanding of economic principles on the part of policymakers.
Typically,  policymakers  introduce  or  maintain  such  policies  to  remain  in  power,  or  to  enrich 
themselves,  or  because  politically  powerful  elites  oppose  the  entry  of  rivals,  the  introduction  of 


The conclusions may also depend on the fact that Acemoglu and Johnson (2007) focus on changes in health
largely  (though  not  solely)  associated  with  mortality.  Bleakley  (2007),  focusing  on  changes  related  to  morbidity, 
obtains  different  results. 
new technologies, or improvements in the property rights of their workers or competitors (for
example, Acemoglu, Johnson and Robinson, 2005a). But this perspective implies that theory
again becomes particularly important in evaluating (or framing) possible effects of large-scale
policy interventions; counterfactual analyses that ignore political economy factors, like those
that do not take account of general equilibrium effects, may give misleading answers. In this
case, convincing micro or even macro (general equilibrium) evidence about the effects of a
particular policy change on economic outcomes is not in itself sufficient to gauge what the
implications will be when such a policy is encouraged or implemented.

The  experience  of  Ghana  with  exchange  rate  policy  under  Prime  Minister  Kofi  Busia  in 
1971  provides  a  sharp  illustration.  Busia  pursued  expansionary  economic  policies  after  coming 
to  power  in  1969,  and  maintained  various  price  controls  and  an  overvalued  exchange  rate.  But 
Ghana was soon suffering from a series of balance of payments crises and foreign exchange shortages.
Faced with these crises, Busia signed an agreement with the IMF on December 27, 1971,
which included a massive devaluation of the currency. A few days following the announcement
of the devaluation, Busia was overthrown by the military led by Lt. Col. Acheampong, who
immediately reversed the devaluation (see, for example, Herbst, 1993; Boafo-Arthur, 1999).
There was little doubt that devaluation was good economics in Ghana. But it was not good
politics. State controls over prices, wages, marketing boards and exchange rates were an important
part of the patronage network, and any politician who lost the support of this network
was  susceptible  both  at  the  polls  and  against  the  military.  Busia  suffered  this  fate. 

This  episode  illustrates  a  general  point:  When  political  economy  factors  are  important, 
evidence  on  the  economic  effects  of  large-scale  policy  changes  under  a  given  set  of  political 
conditions  is  not  sufficient  to  forecast  their  effect  on  the  economy  and  society.  This  principle 
does not just apply to exchange rate policy. For example, the fact that increasing availability
of credit to firms would increase aggregate output given all other policies does not imply
that an actual reform of the credit market will necessarily work. Consistent with this perspective,
Haber and Perotti (2008) argue and provide evidence that limiting access to finance
is a powerful tool in the hands of political and economic elites for restricting entry into lucrative
businesses. Thus, reforms of credit markets will often face political opposition from
powerful parties, and even when they are implemented, this implementation may be imperfect
or accompanied by other policies aimed at nullifying the effects of the reform. This type of
endogenous policy response undermining the objectives of a reform is termed the seesaw effect
in Acemoglu, Johnson, Robinson and Querubín (2008), who provide evidence that the reforms
aimed at reducing inflation by granting independence to the central bank typically do not work
in societies with weak institutions and sometimes trigger other policy responses (for example,
larger government deficits) to undo the reduction in the ability of the government to provide
favors to politically powerful groups.

There are many parallels between the implications of general equilibrium effects for the
interpretation (and extrapolation) of partial equilibrium estimates discussed in the previous
section and the implications of political economy factors. Even though in general we have less of
an understanding of the channels of influence of political economy, a general principle provides
a useful starting point: large-scale shocks and policy interventions will create political economy
responses  from  those  who  see  their  economic  or  political  rents  threatened  or  from  those  that 
see  new  options  to  increase  these  rents.  Such  responses  are  the  basis  of  all  three  examples 
mentioned  so  far:  the  overthrow  of  Busia,  potential  obstacles  to  credit  market  reform,  and  the 
seesaw  effect.  The  difficulty  lies  in  the  fact  that  which  groups  and  individuals  will  be  able  to 
mobilize  and  respond  to  these  changes  will  vary  across  different  applications. 

How should empirical research in economic development take political economy into account?
A first step would be to use empirical work to understand better the role of political
economy factors in development. This type of research on empirical political economy of development
is relatively new. The first generation of work focused on cross-country variation
(see  the  overview  in  Acemoglu,  Johnson  and  Robinson,  2005a).  Although  research  in  this 
area  is  expanding,  given  the  importance  of  political  economy  for  the  problems  of  development, 
it  remains  surprising  how  few  papers  investigate  important  political  economy  channels  using 
microdata  and  careful  empirical  strategies.  I  now  discuss  a  few  of  these  papers  to  give  a  sense 
of  what  approaches  are  available. 

Low  agricultural  productivity  throughout  the  developing  world  is  a  major  problem,  and 
also  a  puzzle.  In  many  instances,  "fallowing,"  plowing  the  land  but  leaving  it  unseeded  for 
a  period  of  time  so  as  to  reduce  weed  growth  and  conserve  soil  moisture,  would  increase 
productivity  considerably.  Goldstein  and  Udry  (2008)  document  that  in  southern  Ghana  the 
amount  of  fallowing  is  massively  insufficient.  A  non-political  economy  answer  would  be  to 
encourage  fallowing.  But  in  reality,  this  recommendation  (or  policy)  would  be  incorrect  or  at 
least  seriously  incomplete,  because  Goldstein  and  Udry  show  that  fallowing  increases  the  risk 
of  confiscation  of  land  by  powerful  chiefs  and  other  connected  individuals.  In  fact,  those  with 
sufficient  political  power,  who  presumably  face  a  lower  risk  of  confiscation,  choose  significantly 
higher  levels  of  fallowing.  This  finding  illustrates  both  the  importance  of  secure  property  rights 
and  the  role  of  political  economy  constraints  on  productive  investments.  It  also  highlights  the 
role  of  local  power  structures  in  villages  in  shaping  the  security  of  property  rights  and  incentives 
for investment.

More work is needed on understanding how the political economy context is shaped. An
emerging literature investigates these issues using microdata. As one example, Ferraz and
Finan  (2008)  use  audit  reports  from  an  anti-corruption  program  in  Brazil  to  estimate  the 
effect  of  electoral  accountability  on  corruption  and  misappropriation  of  funds  by  politicians. 
They  find  that  mayors  who  cannot  get  reelected  because  of  term  limits  are  significantly  more 
corrupt  and  misappropriate  27  percent  more  resources  than  mayors  with  reelection  incentives. 
They  also  show  that,  consistent  with  theory,  these  effects  are  stronger  when  voters  have  access 
to  less  information  and  when  judicial  punishment  against  corruption  is  weaker.  In  a  related 
paper,  Ferraz  and  Finan  (2009)  study  the  effects  of  politician  salaries  on  politician  behavior 
and  quality  of  public  services.  They  exploit  a  discontinuity  in  the  salaries  of  local  politicians 
across  Brazilian  municipalities  resulting  from  a  constitutional  amendment  imposing  salary  caps 
depending  on  the  size  of  municipal  population.  Using  regression  discontinuity  techniques,  they 
find  that  greater  salaries  are  associated  with  greater  competition  among  potential  candidates, 
and moreover that the quality of the elected legislatures measured by education or experience
improves. Higher salaries are also associated with improvements in various dimensions of
politician performance.

Another  approach  is  to  assess  the  extent  to  which  past  historical  institutions  have  long-run 
effects.  Acemoglu,  Johnson  and  Robinson  (2005a)  summarize  several  cross-country  studies 
suggesting  that  certain  major  events  such  as  the  foundation  of  colonial  institutions  or  the 
separation  of  the  Koreas  can  have  persistent  effects.  However,  controlling  for  confounding 
factors  is  often  difficult  in  cross-country  studies  and  the  exact  mechanism  leading  to  persistent 
effects  is  often  difficult  or  impossible  to  pinpoint.  Recent  work  by  Dell  (2009)  focuses  on 
the  potential  effects  of  the  forced  labor  system  used  by  the  Spanish  colonial  government  in 
Peru  and  Bolivia.  This  system,  which  forced  a  large  fraction  of  the  adult  male  population  of 
villages  near  the  Potosi  silver  and  Huancavelica  mercury  mines  to  work  in  these  mines,  was 
used extensively in the sixteenth century, and was abolished in 1812. Those inside and outside
the  boundary  of  the  catchment  area  of  the  forced  labor  program  were  subject  to  different  labor 
regulations.  In  a  regression  discontinuity  design,  Dell  finds  that  areas  subjected  to  forced  labor 
more  than  200  years  ago  now  have  about  one-third  lower  household  equivalent  consumption. 
The  available  data  also  allow  an  investigation  of  some  potential  mechanisms  for  this  very  large 


Returning to the contrast between different counterfactual exercises, one might question, for example,
whether  this  regression  discontinuity  estimate  would  be  informative  about  the  effects  of  a  large-scale  increase  in 
politician  salaries,  which  might  cause  different  composition  effects  than  cross-municipality  variation  in  salaries 
induced  by  salary  caps. 
and persistent effect, which appears to be related to lack of public goods in areas subject to
forced  labor.  This  lack  of  public  goods  in  turn  may  be  related  to  the  policies  of  the  Spanish 
governments  to  limit  competition  for  labor  in  the  catchment  areas  from  private  landholders 
and  businesses. 

Finally,  again  related  to  the  issue  of  coercion,  Naidu  and  Yuchtman  (2009)  investigate 
how  the  ability  of  employers  to  imprison  or  fine  an  employee  for  breach  of  contract  under 
the  Master  and  Servant  Acts,  which  remained  in  effect  in  Britain  until  1875,  affected  labor 
market  relations.  They  provide  evidence  that  employers  made  extensive  use  of  their  coercive 
ability  under  the  law,  and  as  a  consequence,  labor  demand  shocks  were  largely  met  by  using 
increased prosecutions for contract breach rather than higher wages. This finding is consistent
with  theoretical  predictions  of  recent  models  of  labor  coercion  such  as  Acemoglu  and  Wolitzky 
(2009). 

Overall,  the  above-mentioned  papers,  though  distinct  in  methodology  and  scope,  show  how 
microdata  and  regional  variation  in  institutions  and  laws  can  shed  light  on  the  role  of  political 
economy  factors  in  development.  Empirical  work  in  development  economics  should  pay  more 
attention  to,  and  build  a  more  systematic  understanding  of,  political  economy.  It  must  also 
study  how  different  counterfactual  and  policy  experiments  will  interact  with  or  be  resisted  by 
political factors.

Concluding  Remarks 

A key objective of empirical work in development economics is to discriminate between theories
about the causes of economic growth and to conduct counterfactual analysis to build
a systematic understanding of how an economy will respond to large changes in factor supplies,
technology or policy. Economic theory is central in this endeavor. In fact, economic
theory becomes more important in the presence of general equilibrium and political economy
considerations.

General equilibrium and political economy effects are often difficult to estimate or to quantify.
However, they are pervasive and essential for important questions in development economics.
Most research in economics has (and should have) a narrow focus and tries to investigate
a particular set of factors in a specific context. But in development economics where the
agenda ought to be broad, we should also not lose sight of the bigger picture of the problem of
economic development. This implies that we should strive to incorporate general equilibrium
and political economy effects when we can, and we should be cognizant of their importance
when we cannot.

It is also useful to note that general equilibrium and political economy considerations are not
only  a  constraint  in  policy  analysis,  even  though  I  focused  on  cases  in  which  these  considerations 
tend  to  offset  or  reverse  partial  equilibrium  effects.  For  example,  the  endogeneity  of  political 
economy  responses  also  implies  that  certain  economic  policies  and  shocks  might  have  more 
beneficial  effects  than  what  the  pure  economic  analysis  would  suggest,  because  they  can  lead 
to a beneficial change in the political equilibrium. One such example, discussed in Acemoglu,
Johnson  and  Robinson  (2005b),  is  the  possibility  that  Atlantic  trade  may  have  had  long-run 
beneficial  effects  in  Europe  mainly  by  changing  the  political  equilibrium  in  several  countries 
towards  more  participatory  regimes. 
Appendix (Not for Publication)

The Role of Theory in Instrumental-Variables Strategies

In  this  appendix,  I  illustrate  the  role  of  theory  further  using  the  same  example  from  Acemoglu 
and  Johnson  (2007)  used  in  the  text.  The  estimating  equation  for  income  per  capita  can  be 
alternatively  written  as 

$$\Delta y_i = \pi \Delta x_i + \Delta \mu + \Delta \varepsilon_i,$$

where $\Delta$ denotes the difference between dates $t_0$ and $t_1$ (for example, in Acemoglu and Johnson's
estimation 1940 and 1980 or 2000). Estimating this differenced equation would (mechanically)
lead to identical results to those obtained from the estimation of the level equation
for income per capita presented in the text. However, the literature on the cross-country
relationship between health and income sometimes estimates equations of the form

$$\Delta y_i = \alpha y_{it_0} + \gamma x_{it_0} + \pi \Delta x_i + \Delta \mu + \Delta \varepsilon_i,$$

where $x_{it_0}$ is the initial (1940) level of log life expectancy and $y_{it_0}$ is the initial (1940) level of
income per capita. It may then be tempting to estimate this modified model using a similar
IV strategy, in particular, using the same predicted mortality variable (either by treating the
initial levels $x_{it_0}$ and $y_{it_0}$ as exogenous or by instrumenting for them using their lagged values or
geographic  controls).  The  reasoning  would  be  that  if  predicted  mortality  is  a  good  instrument 
for  our  original  estimation  exercise,  then  it  must  be  a  good  instrument  for  estimating  this 
modified  model.  But  this  reasoning  is  incorrect. 

As  already  noted  in  the  text,  we  need  both  the  second  and  the  first  stages  to  be  derived 
from  appropriate  (and  logically  consistent)  economic  models.  The  second  stage  in  Acemoglu 
and Johnson was derived from the neoclassical growth model. A second stage equation such as
the one in this modified model, with initial life expectancy on the right-hand side, could also
be derived from theory, for example, from one where life expectancy in 1940 would have had a
direct effect on productivity in 1980 or 2000 (40 or 60 years thereafter). However, the estimation
strategy additionally requires a theoretical justification for the first stage, or the exclusion
restriction (embedded in the assumption above that $\mathrm{Cov}(M_{it}, \varepsilon_{it}) = 0$). As highlighted above,
this exclusion restriction could only make sense if the baseline level of mortality does not have
a direct effect on future growth. If it did, the assumption that $\mathrm{Cov}(M_{it}, \varepsilon_{it}) = 0$ would be
directly violated. But this implies that the theoretical argument underlying the exclusion


This  may  or  may  not  be  a  valid  assumption  in  general.  As  noted  in  the  text,  Acemoglu  and  Johnson  provide 
evidence  to  substantiate  this  assumption  and  increase  the  plausibility  of  this  exclusion  restriction. 
restriction cannot be logically combined with a model that takes the form of this modified
equation,  even  though  it  is  entirely  consistent  with  the  original  model  we  started  with.  We 
thus  have  a  simple  example  where  one  needs  to  consider  the  theoretical  foundations  of  the 
entire  set  of  economic  relations  (or  more  explicitly,  the  first  and  second  stages)  together  in 
order  not  to  make  logical  errors. 
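
The logic of this argument can be illustrated with a small simulation. The sketch below is not the authors' estimation; it is a stylized long-difference setup in which every number (the true effect `pi`, the first-stage coefficient `psi`, the size of the direct effect, the sample size) is an assumption chosen for illustration, and the function names are hypothetical. The instrument is built from baseline mortality, so once baseline mortality is allowed to affect income growth directly, the same instrument no longer recovers `pi`.

```python
import numpy as np


def iv_slope(y, x, z):
    """Just-identified IV estimate of the slope on x, using z as the instrument."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))


def simulate_iv(direct_effect, n=200_000, pi=-1.0, psi=-0.5, seed=0):
    """Simulate long differences and return the IV estimate of pi.

    All parameter values are illustrative assumptions, not estimates from the paper.
    """
    rng = np.random.default_rng(seed)
    baseline = rng.normal(size=n)     # baseline mortality burden (standardized)
    z = -baseline                     # predicted mortality falls more where the burden was higher
    confounder = rng.normal(size=n)   # unobserved shock moving both health and income
    dx = psi * z + confounder + rng.normal(size=n)              # change in log life expectancy
    dy = (pi * dx + direct_effect * baseline                    # change in log income per capita
          + confounder + rng.normal(size=n))
    return iv_slope(dy, dx, z)


valid = simulate_iv(direct_effect=0.0)     # exclusion restriction holds
invalid = simulate_iv(direct_effect=0.3)   # baseline mortality affects growth directly
print(f"IV estimate, restriction holds:    {valid:.3f}")
print(f"IV estimate, restriction violated: {invalid:.3f}")
```

In this stylized setup the IV estimate converges to `pi - direct_effect / psi`, so it is close to the assumed true value of -1 when the direct effect is zero and close to -0.4 when the direct effect is 0.3. The instrument itself is unchanged across the two runs; only the second-stage model it is paired with differs.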

The  broader  point  is  that  one  cannot  think  of  "instruments"  without  theory.  What  makes 
a  particular  variable  a  valid  instrument  is  a  robust  theoretical  justification  for  the  entire  set 
of economic relationships being estimated, that is, both the specification of the structural parameters
and the corresponding first stages and exclusion restrictions. When either of these
changes, the validity of the instrument may be jeopardized. This, of course, should not be surprising
in view of the discussion we started with: the plausibility of structural parameters, and
thus  their  estimation,  crucially  depends  on  using  economic  theory  to  derive  an  "appropriate" 
model  as  the  building  block  for  estimation.  In  this  light,  it  should  be  clear  that  there  cannot 
be  "instruments"  without  theory. 

Conversation  Between  Theory  and  Econometrics 

The  text  emphasized  the  importance  of  economic  theory  in  helping  us  specify  empirical  models 
with some degree of counterfactual validity. In this Appendix, I briefly discuss some issues that
arise  in  attempts  to  implement  this.  Economic  theory  is  based  on  mathematical  models  that  are 
abstractions  of  reality.  Our  models  involve  several  assumptions,  many  of  which  are  adopted  for 
convenience  or  as  "simplifying"  assumptions.  With  all  of  its  assumptions,  a  model  implies  a  set 
of relationships between different variables. But we may not necessarily wish to take all of these
relationships  as  empirical  predictions  to  be  tested  or  used  for  counterfactual  analysis.  I  now 
suggest  that  we  ought  to  distinguish  between  the  key  implications  and  auxiliary  implications 
of  models,  though  in  practice  we  often  fail  to  do  so. 

Let us think of a model as a mapping from assumptions into empirical relationships. For
concreteness, let us think of these empirical relationships as moments in the data (e.g., the
conditional covariance of one variable with another), and focus on a specific problem: the
relationship between income distribution, credit markets and occupational choice, for example,
studied by Banerjee and Newman (1993). Let $\mathcal{A}$ denote the set of assumptions that one could
make in modeling this problem. An element $A \in \mathcal{A}$ corresponds to a set of assumptions, that is,
such things as the exact specification of production functions, the parameterization of the joint
distribution of talent and initial wealth, the intertemporal preferences of agents, assumptions
concerning how the credit market works, and assumptions on conjectures, expectations and the
equilibrium concept. Since $A$ is an instance of a complete set of assumptions for the problem
at hand, we can think of this as a "model". This model $A$ then generates some empirical
implications. I will summarize these by a set of moments, denoted by $M \in \mathcal{M}$, which could
include the correlations between the interest rate, the occupational distribution, productivity
and initial wealth. Economic theory then amounts to using these assumptions in order to
derive empirical relationships, or sets of moment conditions. Thus we can think of economic
theory as a mapping (correspondence) $f : \mathcal{A} \to \mathcal{M}$, specifying which set of moments we should
expect in the data for a set of assumptions.

The difficulty here is twofold. First, in writing down a model $A \in \mathcal{A}$, most theorists, rightly,
will go for a minimalist structure. In many instances, $A$ will not even contain any stochastic
elements. For example, Banerjee and Newman's model leads to a unique nonstochastic equilibrium
for most parameter values, where the initial distribution of wealth together with an
individual's wealth determines his and his dynasty's occupational choices. To generate moment
conditions that would correspond to correlations in the data, one would then have to add an
unmodeled error term. Although one could interpret this error term as coming from "measurement
error," this is clearly not a satisfactory interpretation, since there are many other
factors  relevant  for  occupational  choices  not  captured  by  the  model  at  hand,  and  they  will  all 
be  subsumed  into  this  error  term,  though  they  are  not  in  reality  related  to  measurement  error. 
Yet  this  is  not  a  shortcoming,  but  rather  a  strength  of  the  model.  Banerjee  and  Newman's 
model  is  successful  largely  because  it  abstracts  from  several  features  of  the  world.  Second,  for 
the  same  reasons  of  parsimony  and  simplicity,  a  model  typically  involves  assumptions  that  are 
"auxiliary,"  meaning  that  they  are  made  for  convenience,  and  in  the  hope  that  they  are  not 
the  source  of  its  "main"  conclusions.  Naturally,  what  these  main  conclusions  are  is  not  always 
a  simple  matter  to  determine. 

This issue notwithstanding, the central difficulty here is that the set of moment implications
$M$ depends on the entire set of assumptions, $A$. Suppose, for example, that $A = A' \cup A''$,
where $A'$ and $A''$ are two disjoint sets of assumptions, and those in the set $A'$ correspond to the
"key" assumptions, while $A''$ contains the auxiliary and simplifying assumptions. For example,
in Banerjee and Newman, the assumptions that all individuals have the same ability in all
occupations, that there is no intensive margin of production, and that dynastic saving decisions
are "myopic" are auxiliary assumptions. These assumptions, taken together, lead to a set of
predictions. For example, taken naively, the model implies that there will exist a threshold level
of wealth, such that all dynasties with initial wealth below this level will remain in subsistence
or become workers, whereas those just above this threshold will become entrepreneurs. If
we were to take such an implication seriously, it would lead to the rejection of the model,
but this would not be an insightful rejection. Instead, the Banerjee and Newman model is
insightful because it highlights how the credit market problems create a link between wealth
and occupational choice, and how this link depends on factor prices, which are themselves
endogenously determined by the entire distribution of income.
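
A toy simulation can make the distinction between the naive threshold prediction and the more robust wealth-occupation link concrete. The sketch below is not the Banerjee and Newman model. It assumes a fixed wealth threshold, a lognormal wealth distribution, and an additive unmodeled shock to the occupational decision, all of which are hypothetical choices made only for illustration, as are the function names.

```python
import numpy as np


def simulate_occupations(noise_sd, n=10_000, threshold=1.0, seed=0):
    """Draw wealth and assign entrepreneurship by a threshold rule with an unmodeled shock."""
    rng = np.random.default_rng(seed)
    wealth = rng.lognormal(mean=0.0, sigma=1.0, size=n)
    # The shock stands in for factors outside the model; noise_sd = 0 gives the naive model.
    shock = rng.normal(size=n) * noise_sd
    entrepreneur = (wealth + shock) > threshold
    return wealth, entrepreneur


def sharp_threshold_holds(wealth, entrepreneur):
    """True if every entrepreneur is wealthier than every non-entrepreneur."""
    if entrepreneur.all() or not entrepreneur.any():
        return True
    return bool(wealth[entrepreneur].min() > wealth[~entrepreneur].max())


def wealth_gap(wealth, entrepreneur):
    """Mean wealth of entrepreneurs minus mean wealth of everyone else."""
    return float(wealth[entrepreneur].mean() - wealth[~entrepreneur].mean())


for sd in (0.0, 0.5):
    w, e = simulate_occupations(noise_sd=sd)
    print(f"noise_sd={sd}: sharp threshold holds = {sharp_threshold_holds(w, e)}, "
          f"wealth gap positive = {wealth_gap(w, e) > 0}")
```

Without the shock, occupations are perfectly separated by wealth, which is the prediction that data would reject. With the shock, the sharp separation disappears but entrepreneurs remain wealthier on average, which is the kind of implication one would want to treat as robust.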

This  discussion  highlights  two  problems.  First,  the  insights  from  certain  models  may  be 
"conceptual"  and  thus  difficult  to  translate  into  moment  conditions.  For  example,  the  insight 
that income distribution matters for occupational choices of an individual (with a given income
level) is a conceptual point, even though one could devise tests by comparing different
economies or the same economy over time in order to investigate the degree to which such a
link is present. Second, not all of the implications of the model should be taken seriously. The
second problem suggests that we may wish to separate the set of moment conditions, $M$, into
two disjoint sets, $M = M' \cup M''$, so that $M'$ corresponds to the set of "robust" moment predictions,
which we should test or use as guidance for empirical work, whereas $M''$ corresponds to
the  moment  conditions  generated  by  the  "auxiliary"  assumptions.  However,  such  a  separation 
is  not  typically  possible,  since  each  moment  implication  of  the  model  is  potentially  generated 
by  all  of  the  assumptions  taken  together.  In  the  Banerjee  and  Newman  model,  for  example, 
we  cannot  simply  remove  the  assumptions  regarding  the  form  of  the  production  function  and 
still  obtain  moment  conditions  about  the  relationship  between  occupation  and  wealth. 

This  discussion  suggests  that  in  formulating  economic  theories  which  we  wish  to  apply  to 
data  (either  by  ourselves  or  by  others),  we  should  pay  special  attention  to  which  dimensions 
of the model are introduced just for achieving tractability and parsimony (the so-called "auxiliary"
assumptions), and which assumptions and implications of the model are "robust" and
should be relied upon and used empirically (or conceptually). Unfortunately, we do not have
the  theoretical  and  econometric  tools  to  achieve  this,^^  and  developing  such  tools,  or  at  the 
very  least,  trying  to  emphasize  in  specific  instances  which  predictions  are  more  robust,  would 
be  a  useful  direction  for  future  research.  In  addition,  even  though  such  tools  are  not  currently 
available, in specific instances, considerable progress is possible. I will now illustrate this using
a  recent  paper  by  Weese  (2009). 

Weese (2009) studies the mergers across Japanese municipalities. Changes in Japanese


''On  the  theory  side,  the  literature  on  robust  comparative  statics,  which  provides  qualitative  predictions 
for  a  range  of  models,  might  be  one  useful  direction.  Such  robust  comparative  statics  can  be  obtained  for 
environments  that  can  be  represented  as  supermodular  games  (Milgrom  and  Roberts,  1994,  Vives,  1990)  or  for 
those  that  can  be  represented  as  aggregative  games  (Acemoglu  and  Jensen,  2009). 
government policy on municipality finance in 1995 led to major changes in municipality structure.
Many small Japanese municipalities that did not previously have incentives to merge,
because  this  would  have  reduced  the  transfers  they  received  from  the  central  government,  were 
induced  to  merge  after  this  change  in  policy  and  the  number  of  municipalities  declined  from 
3232  to  1800.  Mergers  across  municipalities  are  important  for  public  finance  (because  they 
determine the type and amount of local public goods), for development economics (since there
are  marked  inequalities  across  municipalities  in  terms  of  income  and  provision  of  public  goods, 
e.g., Acemoglu and Dell, 2010), and for political economy (as they are a major example of endogenous
coalition formation). Weese is interested in estimating the "preferences" of different
municipalities concerning mergers, and whether given these preferences, a better policy could
have  been  devised.  This  type  of  counterfactual  exercise  clearly  requires  structural  parameters 
in  which  we  can  have  some  confidence.  Thus  this  exercise  must  start  with  a  theoretical  model, 
which  will  then  be  estimated  to  obtain  structural  parameters  to  use  for  the  counterfactual  and 
policy analysis.

One  line  of  attack  would  be  to  specify  a  dynamic  or  static  game  of  coalition  formation, 
with  specific  assumptions  on  the  game  form.  But  these  specific  assumptions  will  translate  into 
different  predictions  on  which  coalitions  will  form  (which  mergers  will  take  place).  Thus  using  a 
specific model, one could typically obtain significantly different predictions than using another
related model, and structural estimation of each of these models is likely to lead to very different
conclusions because auxiliary assumptions, such as those related to the order in which offers
are made and how different equilibria are selected, will impact implications and inference. ^'^
Instead,  Weese  adopts  a  different  approach,  more  in  line  with  the  type  of  conversation  between 
theory  and  empirics  suggested  here:  he  specifies  a  general  hedonic  coalitional  game,  where 
municipality  preferences  depend  on  a  few  characteristics  (average  income,  distance,  etc.),  and 
given  these  preferences  he  focuses  on  the  Von  Neumann-Morgenstern  stable  set.^'*  This  set 
is  not  a  singleton,  thus  the  model,  equipped  with  this  equilibrium/solution  concept,  does  not 
make  a  unique  prediction.  Nevertheless,  it  rules  out  a  large  set  of  mergers  given  underlying 
preferences, and thus specifies a set of moment conditions that can be used for estimation.
Crucially,  these  are  not  all  of  the  moment  conditions  that  will  follow  from  a  model  that  would 
make additional auxiliary assumptions to specify, say, a unique equilibrium merger structure


'  Acemoglu,  Egorov  and  Sonin  (2009)  consider  a  class  of  dynamic  coalition  formation  games,  where  "auxiliary" 
assumptions,  for  example,  those  concerning  the  order  in  which  offers  are  made  and  acceptance  and  voting 
procedures,  do  not  affect  the  set  of  predictions.  This  provides  another  example  of  a  strategy  to  obtain  "robust 
implications,"  even  though  such  results  can  only  be  obtained  under  certain,  somewhat  restrictive,  assumptions. 

'''See,  among  others,  Pakes  (2008)  and  Tamer  (2003),  for  different  approaches  to  the  estimation  of  models 
with multiple equilibria.
for every value of the underlying parameter vector. Interestingly, in this case, Weese is able to
estimate  the  underlying  preferences  and  conduct  counterfactual  policy  analysis. -^^  His  estimates 
show  that  a  different  government  policy  would  have  led  to  better  (merger)  outcomes,  and  that, 
somewhat  surprisingly,  allowing  side  transfers  would  have  disadvantaged  poor  municipalities 
(because  their  willingness  to  merge  with  richer  municipalities  would  have  given  the  latter 
significant  bargaining  power). 
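
The logic of estimating from a solution concept that does not make a unique prediction can be illustrated with a deliberately simplified sketch. It is not Weese's estimator and it does not compute a Von Neumann-Morgenstern stable set. It assumes a hypothetical merger payoff, gain minus beta times distance, and imposes only the necessary condition that every observed merger has a non-negative payoff. The data, the function name identified_set, and the parameter beta are all invented for illustration.

```python
# Hypothetical observed mergers: (income gain, distance between the partners).
# These numbers are invented for illustration only.
observed_mergers = [(3.0, 2.0), (4.0, 5.0), (1.0, 1.0)]


def identified_set(mergers, grid_max=2.0, step=0.01, tol=1e-9):
    """Return (lowest, highest) beta on the grid that no observed merger rejects.

    Assumed payoff of a merger: gain - beta * distance. The only restriction
    imposed is that every observed merger has a non-negative payoff; the model
    does not say which of the admissible mergers should actually occur.
    """
    n_points = int(round(grid_max / step)) + 1
    accepted = []
    for i in range(n_points):
        beta = i * step
        # Keep beta only if it is consistent with every observed merger.
        if all(gain - beta * dist >= -tol for gain, dist in mergers):
            accepted.append(beta)
    return (accepted[0], accepted[-1]) if accepted else None


beta_low, beta_high = identified_set(observed_mergers)
print(beta_low, beta_high)
```

Under these assumptions the restrictions pin down an interval of admissible parameter values rather than a point, which is the sense in which a model that only rules out some outcomes can still discipline estimation.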

Overall,  one  important  direction  for  applied  theory  work  (in  economics  in  general  and  in 
development  economics)  would  be  to  carefully  delineate  which  sets  of  predictions  are  more 
robust,  and  thus  (policy)  invariant  to  auxiliary  assumptions,  and  develop  empirical  strategies 
and methods of conducting counterfactual experiments that exploit these more robust implications.
In the meantime, this discussion also highlights that if structural estimation relies on
all of the moment conditions implied by a (simple) model, this may lead to misleading results.
Making good use of theory does not mean taking all of the predictions of a model seriously,
but rather using the key and robust implications of theoretical models to specify and
estimate structural parameters. It thus also requires us to be cognizant of which dimensions
of a model are adopted just for simplicity, tractability and convenience.
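
The point about relying on all moment conditions can be seen in a toy simulation, offered as a sketch under stated assumptions rather than as a description of any estimator in the literature discussed here. The data are simulated with mean 2 and standard deviation 2. The "robust" moment is E[y] - theta = 0. The "auxiliary" moment, E[y^2] - theta^2 - 1 = 0, follows from a convenience assumption of unit variance that is false in the simulated data. The function names, the identity weighting, and the grid search are illustrative choices.

```python
import random


def sample(n, theta=2.0, sd=2.0, seed=0):
    """Simulate y = theta + noise, with noise standard deviation sd."""
    rng = random.Random(seed)
    return [rng.gauss(theta, sd) for _ in range(n)]


def gmm_estimate(ys, use_auxiliary):
    """Minimise the sum of squared sample moments over a grid of theta values.

    Robust moment:    mean(y) - theta
    Auxiliary moment: mean(y^2) - theta^2 - 1  (valid only if Var(y) = 1)
    """
    n = len(ys)
    m1 = sum(ys) / n
    m2 = sum(y * y for y in ys) / n

    def objective(theta):
        value = (m1 - theta) ** 2
        if use_auxiliary:
            value += (m2 - theta ** 2 - 1.0) ** 2
        return value

    grid = [i / 1000 for i in range(0, 5001)]  # theta in [0, 5]
    return min(grid, key=objective)


ys = sample(20000)
theta_robust = gmm_estimate(ys, use_auxiliary=False)
theta_all = gmm_estimate(ys, use_auxiliary=True)
print(theta_robust, theta_all)
```

In this simulation the estimate based on the robust moment alone stays close to the true value of 2, while adding the moment generated by the false unit-variance assumption pulls the estimate well away from it.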